NSF PAR Search | NSF Public Access Repository


Smoothed Analysis for Learning Concepts with Low Intrinsic Dimension

Chandrasekaran, G; Klivans, A; Kontonis, V; Meka, R; Stavropoulos, K (April 2025, https://doi.org/10.48550/arXiv.2407.00966)

In traditional models of supervised learning, the goal of a learner, given examples from an arbitrary joint distribution on $\mathbb{R}^d \times \{\pm 1\}$, is to output a hypothesis that is competitive (to within $\epsilon$) with the best-fitting concept from some class. In order to escape strong hardness results for learning even simple concept classes, we introduce a smoothed-analysis framework that requires a learner to compete only with the best classifier that is robust to small random Gaussian perturbation. This subtle change allows us to give a wide array of learning results for any concept that (1) depends on a low-dimensional subspace (also known as a multi-index model) and (2) has a bounded Gaussian surface area. This class includes functions of halfspaces and (low-dimensional) convex sets, cases that are only known to be learnable in non-smoothed settings with respect to highly structured distributions such as Gaussians. Surprisingly, our analysis also yields new results for traditional non-smoothed frameworks such as learning with margin. In particular, we obtain the first algorithm for agnostically learning intersections of $k$ halfspaces in time $k^{\mathrm{poly}(\frac{\log k}{\epsilon\gamma})}$, where $\gamma$ is the margin parameter. Before our work, the best-known runtime was exponential in $k$ (Arriaga and Vempala, 1999).
Free, publicly-accessible full text available April 30, 2026
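The abstract above compares the learner against classifiers that stay accurate under small Gaussian perturbations of the input. As a rough illustration only, the following sketch estimates such a "smoothed" error by Monte Carlo for a single halfspace. The function names `smoothed_error` and `halfspace`, the noise scale `sigma`, and the use of NumPy are assumptions made for this example; the paper's formal definition of the smoothed benchmark may differ in its details.

```python
import numpy as np

def halfspace(w):
    """Return a classifier x -> sign(<w, x>) with values in {+1, -1}."""
    return lambda X: np.where(X @ w >= 0, 1, -1)

def smoothed_error(classifier, X, y, sigma, num_draws=200, seed=0):
    """Monte Carlo estimate of the error of `classifier` when each input
    is perturbed by independent Gaussian noise of scale `sigma`.

    Assumption for illustration: the perturbation is x + sigma * z with
    z drawn from a standard Gaussian, averaged over `num_draws` draws.
    """
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(num_draws):
        Z = rng.standard_normal(X.shape)
        errors.append(np.mean(classifier(X + sigma * Z) != y))
    return float(np.mean(errors))
```

With `sigma=0.0` the estimate reduces to the ordinary empirical error, so a classifier whose decision boundary passes close to many examples will typically look worse under smoothing than one that separates the data with a margin.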
Learning Neural Networks with Distribution Shift: Efficiently Certifiable Guarantees

Chandrasekaran, G; Klivans, A R; Lee, L L; Stavropoulos, K (February 2025, https://doi.org/10.48550/arXiv.2502.16021)

We give the first provably efficient algorithms for learning neural networks with distribution shift. We work in the Testable Learning with Distribution Shift framework (TDS learning) of Klivans et al. (2024), where the learner receives labeled examples from a training distribution and unlabeled examples from a test distribution and must either output a hypothesis with low test error or reject if distribution shift is detected. No assumptions are made on the test distribution. All prior work in TDS learning focuses on classification, while here we must handle the setting of nonconvex regression. Our results apply to real-valued networks with arbitrary Lipschitz activations and work whenever the training distribution has strictly sub-exponential tails. For training distributions that are bounded and hypercontractive, we give a fully polynomial-time algorithm for TDS learning one-hidden-layer networks with sigmoid activations. We achieve this by importing classical kernel methods into the TDS framework using data-dependent feature maps and a type of kernel matrix that couples samples from both train and test distributions.
Free, publicly-accessible full text available February 22, 2026
Learning the Sherrington-Kirkpatrick Model Even at Low Temperature

Chandrasekaran, G; Klivans, A (November 2024, https://doi.org/10.48550/arXiv.2411.11174)

We consider the fundamental problem of learning the parameters of an undirected graphical model or Markov Random Field (MRF) in the setting where the edge weights are chosen at random. For Ising models, we show that a multiplicative-weight update algorithm due to Klivans and Meka learns the parameters in polynomial time for any inverse temperature $\beta \le \sqrt{\log n}$. This immediately yields an algorithm for learning the Sherrington-Kirkpatrick (SK) model beyond the high-temperature regime of $\beta < 1$. Prior work breaks down at $\beta = 1$ and requires heavy machinery from statistical physics or functional inequalities. In contrast, our analysis is relatively simple and uses only subgaussian concentration. Our results extend to MRFs of higher order (such as pure $p$-spin models), where even results in the high-temperature regime were not known.
Full Text Available
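For readers unfamiliar with the model named in the abstract above, the sketch below draws random SK-style couplings and computes the conditional distribution of one spin given the others. It is not the multiplicative-weight update algorithm from the paper. It assumes one common convention, Pr[x] proportional to exp(beta * sum over i<j of J_ij x_i x_j) with J_ij Gaussian of variance 1/n, and the function names `sample_sk_couplings` and `conditional_prob_plus` are hypothetical.

```python
import numpy as np

def sample_sk_couplings(n, rng):
    """Symmetric coupling matrix with zero diagonal and independent
    N(0, 1/n) entries above the diagonal (an assumed normalization)."""
    G = rng.standard_normal((n, n)) / np.sqrt(n)
    upper = np.triu(G, k=1)
    return upper + upper.T

def conditional_prob_plus(J, x, i, beta):
    """Pr[x_i = +1 | all other spins] under the Ising model
    Pr[x] proportional to exp(beta * sum_{i<j} J_ij x_i x_j).

    The local field is h_i = sum_j J_ij x_j (J_ii = 0, so x_i itself
    does not contribute), and the conditional is sigmoid(2 * beta * h_i).
    """
    field = J[i] @ x
    return 1.0 / (1.0 + np.exp(-2.0 * beta * field))
```

The logistic form of this conditional is what makes node-by-node parameter estimation natural for Ising models: each spin's neighborhood can be treated as a separate regression problem.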
